Clickhouse #1097

Merged
merged 165 commits into from
Apr 26, 2024

Conversation

Pipboyguy
Collaborator

@Pipboyguy Pipboyguy commented Mar 16, 2024

Description

Implements a basic ClickHouse destination target, along with an adapter to specify table engine type.

Related Issues

Additional Context

This PR only implements the MergeTree and ReplicatedMergeTree engines from the MergeTree family.
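As a rough illustration of what choosing between the two supported engines means at the DDL level, here is a minimal sketch. The function `render_create_table` and the short engine names are hypothetical and are not taken from this PR's code; the sketch only shows the shape of the resulting `CREATE TABLE` statement. Depending on the deployment, `ReplicatedMergeTree` may also need path and replica arguments, which are omitted here.

```python
from typing import Optional, Sequence

# Hypothetical mapping from a short engine name to the ClickHouse ENGINE clause.
# Only the two engines covered by this PR are listed; the short names are illustrative.
ENGINE_CLAUSES = {
    "merge_tree": "MergeTree",
    "replicated_merge_tree": "ReplicatedMergeTree",
}


def render_create_table(
    table: str,
    columns: Sequence[tuple[str, str]],
    engine: str = "merge_tree",
    primary_key: Optional[Sequence[str]] = None,
) -> str:
    """Build a CREATE TABLE statement for one of the two supported engines."""
    if engine not in ENGINE_CLAUSES:
        raise ValueError(f"unsupported table engine: {engine!r}")
    column_sql = ", ".join(f"`{name}` {ctype}" for name, ctype in columns)
    # With no explicit ORDER BY, ClickHouse uses the primary key as the sorting key.
    # A MergeTree table without any key can be declared with ORDER BY tuple().
    if primary_key:
        key_sql = "PRIMARY KEY (" + ", ".join(f"`{c}`" for c in primary_key) + ")"
    else:
        key_sql = "ORDER BY tuple()"
    return (
        f"CREATE TABLE `{table}` ({column_sql}) "
        f"ENGINE = {ENGINE_CLAUSES[engine]} {key_sql}"
    )


ddl = render_create_table(
    "events", [("id", "Int64"), ("payload", "String")], primary_key=["id"]
)
print(ddl)
```

Rejecting unknown engine names up front mirrors the scope stated above: anything outside MergeTree and ReplicatedMergeTree is out of scope for now.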

Features NOT Implemented in this PR

  • Any table engine other than the two mentioned above.
  • Custom sorting keys (ORDER BY when creating tables) - Only useful when using the SummingMergeTree and AggregatingMergeTree table engines. See MergeTree family docs. Note that the sorting key is, by default, implicitly equal to the primary key anyway.
  • Clustering (future work for adapter).
  • Partitioning (future work for adapter).
  • SETTINGS clause (future work for adapter).
  • Taking advantage of multiple replicas: table functions like s3Cluster take advantage of multiple replicas with parallel inserts. Managed services like ClickHouse Cloud will benefit most from this.

Things to keep in mind for future work

  • Not all table engines support a primary index.
  • ClickHouse has no unique constraint support.

TODO

  • Tests
  • Documentation
  • Settings Options
  • Compression codecs

Pipboyguy added 19 commits March 6, 2024 11:33
Signed-off-by: Marcel Coetzee <[email protected]>
@Pipboyguy Pipboyguy linked an issue Mar 16, 2024 that may be closed by this pull request
4 tasks

netlify bot commented Mar 16, 2024

Deploy Preview for dlt-hub-docs canceled.

Name Link
🔨 Latest commit 244ed77
🔍 Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/662b67a754d42e00086408aa

@sh-rp sh-rp force-pushed the 1055-implement-clickhouse-destination branch from 5691b65 to 65c9cec Compare April 23, 2024 16:21
@sh-rp
Collaborator

sh-rp commented Apr 23, 2024

@Pipboyguy could you update the docs a bit to explain better

  • why there are two different ports used?
  • How to use gcs? A link to set up the interoperability mode and instructions on how to provide the credentials for gcs are needed. This has changed now, by the way: these credentials are part of the clickhouse config, so they will be set like this:
    [destination.clickhouse.credentials]
    gcp_access_key_id="..."
    gcp_secret_access_key="..."
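
For setups that pass secrets through the environment rather than a TOML file, the same two keys can be mapped to environment variable names. The sketch below assumes dlt's general convention of upper-cased, double-underscore-separated names; that convention is not stated in this thread, and the helper `dlt_env_name` is hypothetical.

```python
def dlt_env_name(section: str, key: str) -> str:
    """Turn a TOML section path plus key into an upper-case env var name."""
    # Assumption: sections and key are joined with "__" and upper-cased.
    parts = section.split(".") + [key]
    return "__".join(part.upper() for part in parts)


section = "destination.clickhouse.credentials"
for key in ("gcp_access_key_id", "gcp_secret_access_key"):
    print(dlt_env_name(section, key))
```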

@sh-rp
Collaborator

sh-rp commented Apr 24, 2024

Follow up ticket for filesystem gcs s3 interoperability mode: #1272

sh-rp and others added 4 commits April 24, 2024 17:23
# Conflicts:
#	dlt/destinations/sql_jobs.py
…ation' into 1055-implement-clickhouse-destination
Signed-off-by: Marcel Coetzee <[email protected]>
@Pipboyguy
Collaborator Author

@Pipboyguy could you update the docs a bit to explain better

* why there are two different ports used?

* How to use gcs? A link to set up the interoperability mode and instructions on how to provide the credentials for gcs are needed. This has changed now, by the way: these credentials are part of the clickhouse config, so they will be set like this:
  [destination.clickhouse.credentials]
  gcp_access_key_id="..."
  gcp_secret_access_key="..."

Added!

@sh-rp
Collaborator

sh-rp commented Apr 25, 2024

Note to self: loading certain values (such as 12.7001) via jsonl will predictably produce rounding errors; the tests need an update and we need a note in the docs.
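
One plausible contributor, not confirmed anywhere in this thread, is that a JSON number such as 12.7001 passes through a binary float somewhere on the way into a decimal column. The sketch below only demonstrates that general effect in plain Python; the field name `amount` is made up, and nothing here reproduces ClickHouse's own parsing.

```python
import json
from decimal import Decimal

line = '{"amount": 12.7001}'

# Default parsing goes through a binary float, which cannot hold 12.7001 exactly.
as_float = json.loads(line)["amount"]
# Parsing straight into Decimal keeps the literal digits from the file.
as_decimal = json.loads(line, parse_float=Decimal)["amount"]

# Decimal(float) expands the float to its exact binary value, digit for digit.
exact_float_value = Decimal(as_float)

print(exact_float_value)
print(as_decimal)
print(exact_float_value == as_decimal)
```

The two printed values differ in their trailing digits, which is why tests that compare loaded values for exact equality would need a tolerance or a different file format.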

@sh-rp sh-rp force-pushed the 1055-implement-clickhouse-destination branch from 7440f50 to 13f4b1c Compare April 25, 2024 13:10
sh-rp previously approved these changes Apr 26, 2024
@sh-rp sh-rp merged commit d2428ac into devel Apr 26, 2024
49 of 50 checks passed
@sh-rp sh-rp deleted the 1055-implement-clickhouse-destination branch April 26, 2024 13:32
Labels
ci full run the full load tests on pr
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Implement Clickhouse Destination
4 participants